Fused TopK and Sigmoid kernel #1251

samremes · 2025-10-23T15:31:11Z

Motivation

Llama4 Maverick uses a custom routing function that applies a sigmoid instead of a softmax: https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/llama4.py#L62-L71
During inference in particular, running that custom routing function in plain torch becomes a significant overhead.
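
For reference, the routing semantics that the fused kernel targets can be sketched in dependency-free Python. This is an illustrative sketch, not the vLLM or aiter implementation: the function name `topk_sigmoid`, the list-of-lists input, and the choice to select the top-k logits first and apply the sigmoid afterwards are assumptions. Because the sigmoid is monotonic, selecting before or after it picks the same experts.

```python
import math


def topk_sigmoid(gating_output, topk):
    """Hypothetical reference: per token, pick the top-k router logits,
    then apply a sigmoid to the selected values only (no softmax)."""
    weights, indices = [], []
    for row in gating_output:
        # Expert indices of the k largest logits, largest first
        # (ties broken by the lower expert index).
        top = sorted(range(len(row)), key=lambda e: (-row[e], e))[:topk]
        indices.append(top)
        weights.append([1.0 / (1.0 + math.exp(-row[e])) for e in top])
    return weights, indices


# Two tokens, four experts each (made-up logits for illustration).
logits = [[0.0, 2.0, -1.0, 1.0],
          [3.0, -2.0, 0.5, 0.0]]
weights, indices = topk_sigmoid(logits, topk=2)
print(indices)
print(weights)
```

In torch, each of these steps (top-k, sigmoid, dtype conversion) is typically a separate kernel launch, which is the overhead a fused kernel avoids.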

Technical Details

Depends on a composable_kernel PR: ROCm/composable_kernel#3062
3rdparty/composable_kernel needs to be bumped after the CK PR is merged.

Test Plan

Added a simple test to op_tests that covers both the fp16 and bf16 cases.
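
The log under Test Result reports `atol=0.01 rtol=0.01` for each comparison. A minimal sketch of what such a tolerance check means follows, assuming the usual criterion `|actual - expected| <= atol + rtol * |expected|` (the one `torch.isclose` uses). The helpers `is_close` and `all_close` are hypothetical and are not aiter's `checkAllclose`.

```python
def is_close(actual, expected, atol=0.01, rtol=0.01):
    # Assumed criterion: absolute error bounded by atol plus a relative term.
    return abs(actual - expected) <= atol + rtol * abs(expected)


def all_close(actual, expected, atol=0.01, rtol=0.01):
    # Element-wise check over two equally long flat sequences.
    return len(actual) == len(expected) and all(
        is_close(a, e, atol, rtol) for a, e in zip(actual, expected)
    )


# Illustrative values only: low-precision outputs vs. a reference.
print(all_close([0.731, 0.881], [0.7311, 0.8808]))
print(all_close([0.70], [0.7311]))
```

With half-precision inputs, tolerances of this size leave room for rounding differences between the fused kernel and the torch baseline.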

Test Result

python3 op_tests/test_moe_topk_sigmoid.py 
[aiter] import [module_aiter_enum] under /workspaces/dev/aiter/aiter/jit/module_aiter_enum.so
[W1023 15:37:48.300574905 collection.cpp:1114] Warning: ROCTracer produced duplicate flow start: 1 (function operator())
[aiter] import [module_moe_asm] under /workspaces/dev/aiter/aiter/jit/module_moe_asm.so
[aiter] [checkAllclose atol=0.01 rtol=0.01 passed~]
[aiter] [checkAllclose atol=0.01 rtol=0.01 passed~]
Runtime (torch baseline):     29.784888888888865
Runtime (fused topk sigmoid): 4.163444444444443
Uplift:                       7.15x
[aiter] [checkAllclose atol=0.01 rtol=0.01 passed~]
[aiter] [checkAllclose atol=0.01 rtol=0.01 passed~]
Runtime (torch baseline):     31.291888888888884
Runtime (fused topk sigmoid): 4.296666666666662
Uplift:                       7.28x
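
The uplift figures above are the ratio of the baseline runtime to the fused runtime. A short sketch that recomputes them from the logged numbers (the helper name `uplift` is hypothetical, and the log does not state the runtime unit or which run is fp16 and which is bf16):

```python
def uplift(baseline, fused):
    # Speedup factor: how many times faster the fused kernel ran.
    return baseline / fused


# (torch baseline, fused topk sigmoid) runtimes copied from the log above.
runs = [
    (29.784888888888865, 4.163444444444443),
    (31.291888888888884, 4.296666666666662),
]
labels = [f"{uplift(b, f):.2f}x" for b, f in runs]
print(labels)
```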

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

samremes added 4 commits October 22, 2025 15:17

Add topk softmax

2c6096e

Add test for topk sigmoid

f1da3a3

register the op properly

1a9ff75

apply black

7505a7c

samremes mentioned this pull request Oct 23, 2025

[CK_TILE] Top-K with Sigmoid kernel ROCm/composable_kernel#3062

Open

7 tasks

don't use constexpr with std::string

47d9750
